Your browser doesn't support javascript.
Show: 20 | 50 | 100
Results 1 - 2 de 2
Filter
Add filters

Document Type
Year range
1.
Journal of Image and Graphics ; 27(12):3651-3662, 2022.
Article in Chinese | Scopus | ID: covidwho-2203674

ABSTRACT

Objective In order to alleviate the COVID-19 (corona virus disease 2019) pandemic, the initial implementation is focused on targeting and isolating the infectious patients in time. Traditional PCR (polymerase chain reaction) screening method is challenged for the costly and time-consuming problem. Emerging AI (artificial intelligence) -based deep learning networks have been applied in medical imaging for the COVID-19 diagnosis and pathological lung segmentation nowadays. However, current networks are mostly restricted by the experimental datasets with limited number of chest X-ray (CXR) images, and it merely focuses on a single task of diagnosis or segmentation. Most networks are based on the convolution neural network (CNN). However, the convolution operation of CNN is capable to extract local features derived from intrinsic pixels, and has the long-range dependency constraints for explicitly modeling. We develop a vision transformer network (ViTNet). The multi-head attention (MHA) mechanism is guided for long-range dependency model between pixels. Method We built a novel transformer network called ViTNet for diagnosis and segmentation both. The ViTNet is composed of three parts, including dual-path feature embedding, transformer module and segmentation-oriented feature decoder. 1) The embedded dual-path feature is based on two manners for the embedded CXR inputs. One manner is on the basis of 2D convolution with the sliding step equal to convolution kernel size, which divides a CXR to multiple patches and builds an input vector for each patch. The other manner is concerned of a pre-trained feature map (ResNet34-derived) as backbone in terms of deep CXR-based feature extraction. 2) The transformer module is composed of six encoders and one cross-attention module. The 2D-convolution-generated vector sequence is as inputs for transformer encoder. Owing that the encoder inputs are directly extracted from image pixels, they can be considered as the shallow and intuitive feature of CXR. The six encodes are in sequential, transforming the shallow feature to advanced global feature. The cross-attention module is focused on the results obtained by backbone and transformer encoders as inputs, the network can combine the deep feature and encoded shallow feature, and absorb both the global information and the local information in terms of the encoded shallow feature and deep feature, respectively. 3) The feature decoder for segmentation can double the size of feature map and provide the segmentation results. Our network is required to deal with two tasks simultaneously for both of classification and segmentation. A hybrid loss function is employed for their training, which can balance the training efforts between classification and segmentation. The classification loss is the sum of a contrastive loss and a multi-classes cross-entropy loss. The segmentation loss is a binary cross-entropy loss. What is more, a new five-levels CXR dataset is compiled. The dataset samples are based on 2 951 CXRs of COVID-19, 16 964 CXRs of healthy, 6 103 CXRs of bacterial pneumonia, 5 725 CXRs of viral pneumonia, and 6 723 CXRs of opaque lung. In this dataset, COVID-19 CXRs are all labeled with COVID-19 infected lung masks. In our training process, the input images were resized as 448 × 448 pixels, the learning rate is initially set as 2 × 10 - 4 and decreased gradually in a self-adaptive manner, and the total number of iterations is 200, the Adam learning procedure is conducted on four Tesla K80 GPU devices. Result In the classification experiments, we compared ViTNet to a general transformer network and five popular CNN deep-learning models (i. e., Res-Net18, ResNet50, VGG16 (Visual Geometry Group), Inception _ v3, and deep layer aggregation network (DLAN) in terms of overall prediction accuracy, recall rate, F1 and kappa evaluator. It can be demonstrated that our model has the best with 95. 37% accuracy, followed by Inception_ v3 and DLAN with 95. 17% and 94. 40% accuracy, respectively, and the VGG16 is reached 94. 19% ac uracy. For the recall rate, F1 and kappa value, our model has better performance than the rest of networks as well. For the segmentation experiments, ViTNet is in comparison with four commonly-used segmentation networks like pyramid scene parsing network (PSPNet), U-Net, U-Net + and context encoder network (CE-Net) . The evaluation indicators used are derived of the accuracy, sensitivity, specificity, Dice coefficient and area under ROC (region of interest) curve (AUC) . The experimental results show that our model has its potentials in terms of the accuracy and AUC. The second best sensitivity is performed inferior to U-Net + only. More specifically, our model achieved the 95. 96% accuracy, 78. 89% sensitivity, 97. 97% specificity, 98. 55% AUC and a Dice coefficient of 76. 68% . When it comes to the network efficiency, our model speed is 0. 56 s per CXR. In addition, we demonstrate the segmentation results of six COVID-19 CXR images obtained by all the segmentation networks. It is reflected that our model has the best segmentation performance in terms of the illustration of Fig. 5. Our model limitation is to classify a COVID-19 group as healthy group incorrectly, which is not feasible. The PCR method for COVID-19 is probably more trustable than the deep-leaning method, but the feedback duration of tested result typically needs for 1 or 2 days. Conclusion A novel ViTNet method is developed, which achieves the auto-diagnosis on CXR and lung region segmentation for COVID-19 infection simultaneously. The ViTNet has its priority in diagnosis performance and demonstrate its potential segmentation ability. © 2022 Editorial and Publishing Board of JIG. All rights reserved.

2.
BMC Med Imaging ; 21(1): 112, 2021 07 15.
Article in English | MEDLINE | ID: covidwho-1312621

ABSTRACT

BACKGROUND: Lung region segmentation is an important stage of automated image-based approaches for the diagnosis of respiratory diseases. Manual methods executed by experts are considered the gold standard, but it is time consuming and the accuracy is dependent on radiologists' experience. Automated methods are relatively fast and reproducible with potential to facilitate physician interpretation of images. However, these benefits are possible only after overcoming several challenges. The traditional methods that are formulated as a three-stage segmentation demonstrate promising results on normal CT data but perform poorly in the presence of pathological features and variations in image quality attributes. The implementation of deep learning methods that can demonstrate superior performance over traditional methods is dependent on the quantity, quality, cost and the time it takes to generate training data. Thus, efficient and clinically relevant automated segmentation method is desired for the diagnosis of respiratory diseases. METHODS: We implement each of the three stages of traditional methods using deep learning methods trained on five different configurations of training data with ground truths obtained from the 3D Image Reconstruction for Comparison of Algorithm Database (3DIRCAD) and the Interstitial Lung Diseases (ILD) database. The data was augmented with the Lung Image Database Consortium (LIDC-IDRI) image collection and a realistic phantom. A convolutional neural network (CNN) at the preprocessing stage classifies the input into lung and none lung regions. The processing stage was implemented using a CNN-based U-net while the postprocessing stage utilize another U-net and CNN for contour refinement and filtering out false positives, respectively. RESULTS: The performance of the proposed method was evaluated on 1230 and 1100 CT slices from the 3DIRCAD and ILD databases. We investigate the performance of the proposed method on five configurations of training data and three configurations of the segmentation system; three-stage segmentation and three-stage segmentation without a CNN classifier and contrast enhancement, respectively. The Dice-score recorded by the proposed method range from 0.76 to 0.95. CONCLUSION: The clinical relevance and segmentation accuracy of deep learning models can improve though deep learning-based three-stage segmentation, image quality evaluation and enhancement as well as augmenting the training data with large volume of cheap and quality training data. We propose a new and novel deep learning-based method of contour refinement.


Subject(s)
Deep Learning , Lung/diagnostic imaging , Tomography, X-Ray Computed , Algorithms , Humans , Lung/anatomy & histology , Lung Diseases/diagnostic imaging , Lung Diseases/pathology , Neural Networks, Computer
SELECTION OF CITATIONS
SEARCH DETAIL